15.  Supplementary Notes


This report updates results, described in a previous
interim report, of our efforts to develop short range
(0-6 hr) thunderstorm forecasts for aviation.


In the 0-2 hr range, we have made systematic comparisons
of the capabilities of three techniques of varying
complexity in predicting the movement of radar echoes
associated with thundery activity. We used 10- and
30-minute sequences of radar data to produce
10-, 30-, 60-, and 90-minute forecasts. Our results
show that, in general, the complex technique has little
advantage over simple techniques which can be implemented
locally on the mini-computer.


In the 2-6 hr range, we have used a combination of
classical and model output statistics (MOS) to develop
probability forecasts of thunderstorm activity over
most of the U.S. east of the Rockies. Forecasts
valid for the periods 1700-2100, 2000-0000, and 2300-0300
GMT are now available for the spring and summer
seasons and are being transmitted to the field three
times daily by teletype.
1.  INTRODUCTION


In  a previous  report  (Alaka,  et  al.,  1975),  we  described  preliminary 
results  of  a research  and  development  effort  to  improve  aviation  weather 
forecasts  performed  under  Interagency  Agreement  No.  DOT  FA74WAI-488  between 
the  Federal  Aviation  Administration  (FAA)  and  the  National  Weather  Service 
(NWS).  The  immediate  aim  of  this  effort  is  to  develop  improved  objective 
forecasts  of  local  convective  weather,  particularly  thunderstorms,  at 
high  density  terminals  and  surrounding  airspace.  The  effort  falls 
within  the  broad  purpose  of  the  agreement,  which  is  to  develop,  test, 
and  implement  improved  forecast  techniques  of  aviation  weather  in  the 
0-6  hour  time  range. 

The  time  range  and  content  of  aviation  forecasts  must  meet  the  operational 
requirements  of  both  enroute  and  terminal  controllers  as  well  as  flow 
controllers  and  pilots.  The  center  and  terminal  controllers  who  are 
in  direct  contact  with  the  pilot  need  sufficient  detail  to  plan  for 
deviation  requests  and,  when  possible,  to  give  avoidance  advisories  to 
aircraft  under  their  control.  They  are,  therefore,  interested  in 
current  hazardous  conditions  and  the  changes  likely  to  occur  in  these 
conditions.  Timely  forecasts  with  projections  of  10  minutes  to  2 hours 
will  best  suit  the  needs  of  the  terminal  controller. 

Flow  controllers,  on  the  other  hand,  are  more  concerned  with  the  number 
of  aircraft  a given  route  or  terminal  can  accommodate.  Therefore,  they 
need  to  know  the  meteorological  conditions  which  will  exist  in  both  the 
enroute  and  terminal  areas.  Although  most  flights  are  of  comparatively 
short duration, some last several hours. To enable flow controllers
to utilize the airspace most effectively, forecasts of several hours
over a comparatively large area are needed. This will enable the
controller to anticipate and plan for expected traffic flow adjustments
caused by deteriorating weather.

To  satisfy  the  needs  of  the  above  two  categories  of  users,  we  have 
divided  our  total  effort  into  two  main  tasks,  the  first  dealing  with 
"very  short"  (0-2  hour)  forecasts  and  the  second  with  "short"  (2-6  hour) 
forecasts.  This  division  is  convenient  since  a different  approach  is 
best  suited  for  each  forecast  range. 

2.  VERY  SHORT  (0-2  hr)  FORECASTS 

One of the best practical ways (if not the only way) of determining, with
sufficient detail and timeliness, the location, movement, and development
of convective weather hazardous to aviation is to monitor the associated
radar echoes. We therefore decided that observations from WSR-57 radars
should constitute our main data base for these very short forecasts. In
particular, we have used reflectivities from these radars, which have been
digitized into discrete intensity levels, to provide a basis for their
quantitative application. The aim is to use these digitized radar
observations to develop automated techniques for identifying, tracking, and
extrapolating the motion and development of local convective weather
systems at and around air terminals. Since timeliness is of the essence,
we have placed due emphasis on techniques which can be readily implemented
with locally available facilities.

In  recent  years  several  studies  have  been  made  to  determine  the  predictability 
of radar echo motion. A summary of such studies was given in a preliminary
report (Alaka, et al., 1975) and more recently by Elvander (1976). So far,
no  systematic  effort  has  been  made  to  determine  the  comparative  merit  of  the 
different  techniques  used  by  different  investigators. 

2.1  Techniques  selected  for  study 

For  the  purpose  of  this  study  we  have  selected  three  methods 
of  various  degrees  of  complexity.  The  first  consists  of 
obtaining  a displacement  vector  for  the  entire  PPI.  This  may 
be  done  by  either  a cross-correlation  or  error  minimization 
technique.  The  procedure  described  by  Austin  and  Bellon  (1974) 
is  an  example  of  the  cross-correlation  method  which  we  shall 
call the CCM model. In this model, all data above a
predetermined threshold are used in the computations. Individual
echoes are not considered. Two successive PPI's are space
lagged with respect to one another and correlation coefficients
are computed. The technique is organized so that relatively
little computation is necessary. The correlation subroutine
is used twice. The first time, only every other row and
column of data are used. A rough estimate of the location
of the best lags is determined from these computations.
The possible lags extend to a maximum of 5 grid lengths
in the east-west and north-south directions.

During  the  second  time  through  the  correlation  subroutine, 
all  the  data  are  used,  but  only  deviations  of  two  grid  lengths 
east-west  and  north-south  are  allowed  from  the  lags  given  by 
the  rough  estimate.  Ground  clutter  and  data  beyond  the  range 
of  120  nmi  are  excluded  from  the  computations. 
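
The two-pass lag search described above can be sketched as follows. This is
only an illustration under stated assumptions, not the Austin and Bellon code:
the function names, the list-of-rows grid layout, and the default threshold
are hypothetical, and the exclusion of ground clutter and of data beyond
120 nmi is not modeled.

```python
from math import sqrt

def correlation(prev, curr, dx, dy, threshold=1, step=1):
    """Correlation between prev displaced by (dx, dy) and curr.

    Grids are lists of rows. Values below `threshold` count as zero.
    Only every `step`-th row and column of prev is sampled.
    """
    n_rows, n_cols = len(prev), len(prev[0])
    a, b = [], []
    for r in range(0, n_rows, step):
        for c in range(0, n_cols, step):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                p, q = prev[r][c], curr[r2][c2]
                a.append(p if p >= threshold else 0)
                b.append(q if q >= threshold else 0)
    n = len(a)
    if n == 0:
        return 0.0
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def best_lag(prev, curr, max_lag=5, refine=2, threshold=1):
    """Return the (dx, dy) displacement, in grid lengths, of the whole PPI."""
    lags = range(-max_lag, max_lag + 1)
    # First pass: every other row and column, lags up to max_lag.
    coarse = max(((dx, dy) for dx in lags for dy in lags),
                 key=lambda lag: correlation(prev, curr, lag[0], lag[1],
                                             threshold, step=2))
    # Second pass: all data, but only lags within `refine` of the estimate.
    fine = [(coarse[0] + i, coarse[1] + j)
            for i in range(-refine, refine + 1)
            for j in range(-refine, refine + 1)]
    return max(fine, key=lambda lag: correlation(prev, curr, lag[0], lag[1],
                                                 threshold, step=1))
```

The coarse pass examines 121 lags on a quarter of the data, and the fine pass
examines only 25 lags on all the data, which is where the saving in
computation comes from.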

The  second  method  is  to  track  and  extrapolate  the  motion  of 
echo  centroids.  This  procedure  was  described  by  Barclay  and 
Wilk  (1970),  and  Wilk  and  Gray  (1970);  we  shall  refer  to  it 
as  the  LLS  method.  The  LLS  technique  is  almost  self-explanatory. 
Successive  positions  of  echo  centroids  are  used  to  compute 
linear  x-  and  y-  displacement  equations  by  the  method  of  least 
squares.  The  echo  centroids  are  then  extrapolated  according 
to  the  required  time  interval.  Computations  are  allowed  only 
if  three  or  more  past  centroid  positions  are  available.  The 
entire  echo  is  moved  according  to  the  nearest  whole  grid 
length.  Forecasts  of  echo  centroid  positions  are  verified 
by  observations,  and  the  resulting  statistics  assembled. 

These data are all in standard decimal form. However, it is
the area of coverage of echoes of a given intensity that is
verified in these experiments.
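
As an illustration of the LLS procedure, the sketch below fits separate
least-squares lines to the x- and y-positions of an echo centroid and
extrapolates them. The function names and the units (minutes, grid lengths)
are assumptions made for the example, not taken from the original programs.

```python
def fit_line(times, values):
    """Least-squares slope and intercept of values against times."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, v_mean - slope * t_mean

def extrapolate_centroid(times, xs, ys, lead_time):
    """Forecast the centroid position lead_time after the last observation."""
    if len(times) < 3:
        raise ValueError("at least three past centroid positions are required")
    t_fcst = times[-1] + lead_time
    ax, bx = fit_line(times, xs)
    ay, by = fit_line(times, ys)
    return ax * t_fcst + bx, ay * t_fcst + by

def grid_shift(last_xy, forecast_xy):
    """Displacement of the whole echo, rounded to whole grid lengths."""
    return (round(forecast_xy[0] - last_xy[0]),
            round(forecast_xy[1] - last_xy[1]))
```

For example, centroids observed at x = 10, 12, 14 at 10-minute intervals, with
constant y, give a 30-minute forecast displacement of 6 grid lengths in x.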

The third method tracks individual echoes by first considering
the entire echo complex and then making adjustments for
individual echoes defined in accordance with certain criteria.
This method was developed at the Stanford Research Institute
(SRI) under contract with the National Weather Service and
was described by Duda and Blackmer (1972) and Blackmer,
Duda and Reboh (1973). We shall refer to it as the SRI
model.

The SRI model isolates, tracks, and forecasts the motion of
individual echoes. A contouring technique is used to isolate
these echoes. Data in a given row are scanned forward and
backward, and grid points with an intensity below a
predetermined threshold are set to zero. Then rows immediately
above and below are similarly scanned. Directed line segments
from one zero value to another constitute a boundary. However,
isolated zero values which are surrounded by non-zero values
are not considered in delineating the boundary. An echo is
defined as the area within the boundary. The extreme rows
and columns where non-zero values are located form a
rectangular "window" surrounding the echo. This window is
subsequently used in the tracking program. Pertinent statistics
are computed for each echo identified by this procedure.
These include the significance weight, echo size, center of
mass, center of echo area, maximum intensity, and average
intensity.
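
The sketch below shows how echoes, their windows, and some of these statistics
might be obtained from a digitized grid. It is not the SRI contouring
procedure: a simple 8-connected flood fill is substituted for the boundary
tracing, isolated sub-threshold points inside an echo are not filled in, and
the significance weight is omitted. The function name and dictionary keys are
hypothetical, and a positive threshold is assumed.

```python
def find_echoes(grid, threshold):
    """Group grid points at or above threshold into 8-connected echoes."""
    n_rows, n_cols = len(grid), len(grid[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    echoes = []
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if seen[r0][c0] or grid[r0][c0] < threshold:
                continue
            stack, cells = [(r0, c0)], []
            seen[r0][c0] = True
            while stack:
                r, c = stack.pop()
                cells.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < n_rows and 0 <= cc < n_cols
                                and not seen[rr][cc]
                                and grid[rr][cc] >= threshold):
                            seen[rr][cc] = True
                            stack.append((rr, cc))
            values = [grid[r][c] for r, c in cells]
            total = sum(values)
            rows = [r for r, _ in cells]
            cols = [c for _, c in cells]
            echoes.append({
                "size": len(cells),
                # Window: extreme rows and columns containing the echo.
                "window": (min(rows), max(rows), min(cols), max(cols)),
                "max_intensity": max(values),
                "mean_intensity": total / len(values),
                "center_of_area": (sum(rows) / len(cells),
                                   sum(cols) / len(cells)),
                "center_of_mass": (
                    sum(r * grid[r][c] for r, c in cells) / total,
                    sum(c * grid[r][c] for r, c in cells) / total),
            })
    return echoes
```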

The  tracking  program  consists  of  matching  successive  radar- 
scopes  (PPI's)  by  a cross-correlation  technique  which  involves 
minimizing  the  sum  of  the  absolute  differences  between  the 
intensity  digits  over  the  entire  radarscope  (global  matching) 
or  over  selected  windows  (local  matching).  This  is  done  by 
moving  one  radarscope  relative  to  the  other  until  the  best 
match  is  obtained. 
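
A minimal sketch of this matching step follows, assuming list-of-rows grids and
a hypothetical search limit `max_shift`. Passing a window (first row, last
row, first column, last column) gives a local match; omitting it gives a
global match. Points shifted outside the radarscope are treated as zero, which
is an assumption of this example.

```python
def sad(prev, curr, dx, dy, window=None):
    """Sum of absolute differences between prev and curr shifted by (dx, dy)."""
    n_rows, n_cols = len(prev), len(prev[0])
    r_lo, r_hi, c_lo, c_hi = window if window else (0, n_rows - 1,
                                                    0, n_cols - 1)
    total = 0
    for r in range(r_lo, r_hi + 1):
        for c in range(c_lo, c_hi + 1):
            r2, c2 = r + dy, c + dx
            inside = 0 <= r2 < n_rows and 0 <= c2 < n_cols
            shifted = curr[r2][c2] if inside else 0
            total += abs(prev[r][c] - shifted)
    return total

def best_match(prev, curr, max_shift=3, window=None):
    """Shift (dx, dy) of prev that best matches curr over the chosen area."""
    shifts = range(-max_shift, max_shift + 1)
    return min(((dx, dy) for dx in shifts for dy in shifts),
               key=lambda s: sad(prev, curr, s[0], s[1], window))
```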

To  initiate  the  tracking  procedure,  significant  echoes  are 
identified  on  the  first  PPI  and  the  remaining  digits  are  set 
to  zero.  A global  match  is  then  made  between  the  filtered 
first  PPI  and  the  next  PPI.  The  displacement  thus  obtained 
is  applied  to  each  significant  window  in  the  first  PPI  and  is 
modified  by  local  matching.  The  displacement  obtained  from 
both  global  and  local  matches  is  then  used  as  the  actual 
displacement  of  the  echo.  Predicted  displacements  for  the 
next  echo  are  a weighted  average  of  the  previous  predicted 
and  latest  actual  displacements.  This  allows  past  history 
of  the  echo  motion  to  be  incorporated  in  the  forecast. 
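
The weighted average can be written in a few lines. The weight of 0.5 used
here is purely illustrative; the weights actually used in the SRI model are
not given in this report.

```python
def update_prediction(previous_predicted, latest_actual, weight=0.5):
    """Blend the latest actual displacement with the previous prediction.

    `weight` is the (assumed) weight given to the latest actual displacement.
    Displacements are (dx, dy) pairs in grid lengths.
    """
    return tuple(weight * a + (1.0 - weight) * p
                 for p, a in zip(previous_predicted, latest_actual))
```

Applied repeatedly from one PPI to the next, this acts as a running average in
which older displacements carry progressively less weight.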

Four situations may occur when two successive PPI's are
matched: the significant echoes may remain essentially
unchanged, become lost, split, or merge. If they split,
the largest fragment is retained as the old echo. If they
merge, a check is made to ensure that each new echo is
different from all other echoes. Lost echoes may later
be recovered. Pertinent statistics are computed for the
successfully matched echoes, which are then removed from
the PPI.

The PPI is then scanned for significant new echoes. Often
a new echo appears to be in the path of an echo lost during
an earlier PPI processing. A check is made to make sure that
this is indeed the case. Pairings are then made between the
old lost echoes and the new echoes according to certain
criteria which tend to minimize the number of spurious new
echoes and lead to more consistent tracks.

If  a new  echo  is  obtained  without  any  previous  history,  it 
is  assigned  a predicted  displacement  that  is  a function  of 
the  surrounding  echoes.  Those  echoes  which  are  closest 
and  with  the  largest  tracking  histories  are  given  the  highest 
weight. 

The displacements used to make forecasts are slightly
different from those used in the tracking portion of the
model. They are also a weighted average of the previously
predicted displacement and the last actual displacement.
But now, echoes with a long tracking history have a higher
weight than those with a short lifespan. The entire echo,
isolated during the last PPI used, is moved during the
forecast interval. The area predicted to be covered by
echoes of predetermined intensities is verified in these
experiments.

We have adapted versions of the above three methods to test
their comparative performance. We have somewhat relaxed
the original SRI echo definition criteria. Thus, in defining
an echo of a given intensity threshold, we allow an intensity
value of one digit less than the threshold to exist between
two consecutive higher values, both in the orthogonal and
diagonal directions.

2.2  Data  used  in  the  study 

We  have  used  digitized  weather  radar  data  originally 
collected  and  archived  at  the  National  Severe  Storms 
Laboratory  (NSSL)  during  the  spring  season  of  1972.  The 
original data are in the form of nine intensity digits
representing the power in dBm returned by the target.

The  data  were  collected  by  NSSL's  WSR-57  weather  radar  at 
Norman,  Oklahoma,  over  a range  extending  from  14  to  125  nmi, 
on  a polar  grid  of  1 nmi  radial  distance  and  2 deg  azimuth. 

We have transferred these data by quadratic interpolation
to a cartesian grid of 120 x 120 points, 2 nmi apart. We
have also represented the data on a coarser 60 x 60 grid
of squares 4 nmi on a side, attributing to each square
the maximum reflectivity value observed over the 16 nmi²
area.
We have used the available reflectivities observed at
different elevation angles to compute values of the
vertically integrated liquid water (VIL) content. As
in the case of zero degree reflectivities mentioned above,
we have represented the VIL data on a 120 x 120 grid and
a 60 x 60 grid. The value attributed to each 16 nmi²
square in the coarser grid is the maximum value found in
the corresponding four 4 nmi² squares of the finer grid.
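
The reduction from the finer to the coarser grid amounts to taking the maximum
over each block of 2 x 2 fine-grid values. A minimal sketch, with a
hypothetical function name and the grid held as a list of rows:

```python
def coarsen_max(grid, factor=2):
    """Assign to each coarse square the maximum of its factor x factor block."""
    n_rows, n_cols = len(grid), len(grid[0])
    return [[max(grid[r + i][c + j]
                 for i in range(factor) for j in range(factor))
             for c in range(0, n_cols - factor + 1, factor)]
            for r in range(0, n_rows - factor + 1, factor)]
```

With `factor=2`, a 120 x 120 grid becomes a 60 x 60 grid. Using the maximum
rather than the mean preserves small intense cores in the coarser field.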

We  have  tested  the  three  methods  described  above  on  five 
storm  days  from  the  NSSL  spring  1972  data  set.  These  are 
April 13, 19, 20, 26, and May 22-23. These days, which had
varying types of convective weather, severe on occasion, are
representative of conditions to be expected in real-time
operations. No stratiform rain situations are included
in this study.

2.3  Verification  Procedures 

As mentioned above, different methods of forecasting radar
echo motion have not previously been evaluated according to
the same criteria, thus leading to difficulties in attempting
to assess their comparative merit. In the present study,
we have subjected all three methods selected for testing to
the same verification scores. Table 1 provides the basis
for computing the scores. A successful forecast (X) is
scored when both predicted and observed values of radar
reflectivities or VIL are equal to or larger than the
predetermined threshold. A miss (Y) occurs when a value
equal to or larger than the predetermined threshold is observed
but not forecast. A false alarm (Z) occurs when a value
equal to or larger than the predetermined threshold is
forecast but not observed. Cases involving correct forecasts
of subthreshold values (W) are not included in the
verification statistics.


Table 1.  Contingency table defining the variables used in
computing the verification scores.

                                    Forecast
                        Value equal to     Value less than
Observed                or larger than     threshold
                        threshold

Value equal to
or larger than                X                  Y
threshold

Value less than               Z                  W
threshold
Following are the scores used and their definitions:


a.  Probability of Detection (POD), described by
Donaldson et al. (1975), also referred to by
Panofsky and Brier (1958) as "prefigurance."

POD = X / (X + Y)                                   (1)

b.  False Alarm Ratio (FAR) (Donaldson et al., 1975).

FAR = Z / (X + Z)                                   (2)

c.  Critical Success Index (CSI) (Donaldson et al., 1975),
commonly known as the "threat score" (Palmer
and Allen, 1949).

CSI = X / (X + Y + Z)                               (3)
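
The three scores can be computed directly from gridded forecast and observed
fields. The sketch below is illustrative only; the function names are
hypothetical, and returning NaN for an undefined ratio is a choice made for
this example.

```python
def contingency_counts(forecast, observed, threshold):
    """Count hits (X), misses (Y), false alarms (Z), and correct nulls (W)."""
    x = y = z = w = 0
    for f_row, o_row in zip(forecast, observed):
        for f, o in zip(f_row, o_row):
            f_yes, o_yes = f >= threshold, o >= threshold
            if f_yes and o_yes:
                x += 1
            elif o_yes:
                y += 1
            elif f_yes:
                z += 1
            else:
                w += 1
    return x, y, z, w

def scores(x, y, z):
    """Return (POD, FAR, CSI); W does not enter any of the three scores."""
    nan = float("nan")
    pod = x / (x + y) if x + y else nan
    far = z / (x + z) if x + z else nan
    csi = x / (x + y + z) if x + y + z else nan
    return pod, far, csi
```

Because W is excluded, the scores are not inflated by the large number of
grid boxes in which no echo is either forecast or observed.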


We verified our forecasts over areas of 16 and 64 nmi².
The 16 nmi² area is very close to that of a grid-box in the
D/RADEX system (Bigler, et al., 1973; Saffle, 1976). The
64 nmi² area is about the size of the Washington, D.C.
metropolitan area. This area would cover the approaches to most
airports as well as the terminal itself. An idea of the
numbers involved in computing the verification scores is
given by Table 2, which lists the total number of
forecasts (X + Y + Z in Table 1) by the SRI model for different
threshold intensities, data sequences, forecast projections,
and verification areas.


Table 2.  Total number of forecasts made by the SRI model from zero tilt
reflectivity data, verified over 16 nmi² and 64 nmi², and from VIL data,
verified over 16 nmi².

                         10-min Data                    30-min Data
Reflectivity    Forecast Projection (minutes)   Forecast Projection (minutes)
threshold           10        30        60          30        60        90

                          16 nmi² Verification Area
     1           *1373     66486     86390       31287     30102     27641
     2           33096     58603     58510       20470     19968     19033
     3           40334     43382     42848       14839     14888     14132
     4           24681     27068     27112        8694      8771      8434

                          64 nmi² Verification Area
     1          323503    330626    131121      *633 9     45510     29170
     2           90807     95726     95763       320*7     31638     30148
     3           68941     73498     72496       24565     24842     23722
     4           44159     *777*     48493       15389     15688     15328

                          16 nmi² Verification Area (VIL data)
     1           15361     22138     21752
     2            64)2      6348      3*30
     3            4393      *527      3849
     4            3220      3*07      3018
     5            2657      288*      2*31
2.4  Results

We have tested the three forecast models using both 10- and
30-minute sequences of input data. We used the SRI and LLS
models to produce 10-, 30-, and 60-minute forecasts from
10-minute data sequences, and 30-, 60-, and 90-minute forecasts
from 30-minute sequences. With the CCM model, we made
forecasts of 10, 30, and 60 minutes from 10-minute data sequences,
and 30- and 60-minute forecasts from 30-minute sequences.


The  SRI  model  uses  two  criteria  to  identify  echoes  for  tracking 
purposes — a reflectivity  threshold  and  a significance  weight 
(SW)  threshold.  The  formula  for  the  latter  is: 


S.W. = 100 log10 [ (Zmax / Zref)^0.5 * SUM(Z^(1/1.6)) / Zref^(1/1.6) ]


where Zref is an arbitrary normalization reflectivity value of
100 mm⁶/m³, Zmax is the reflectivity corresponding to the
highest intensity integer, and Z^(1/1.6) is the Marshall-Palmer
rainfall-rate estimate (1948). Thus, the first term of the
significance weight accounts for the maximum reflectivity
within the cell, while the second term, which is a sum of Z values
meeting a predetermined intensity threshold, is a measure of
echo size. Both factors are known to be pertinent to the
severity of convective weather. The SRI significance weight
thus imposes a selectivity constraint on the data because only
echoes which are either intense or large, or both, are considered
significant enough for tracking. As an illustration, Figure 1
gives different echo configurations having a significance weight
of 150. The trade-off between maximum intensity within the
cell and the size of the cell is clearly indicated. Small
intense cells are selected as well as larger less intense cells.

For  an  intensity  threshold  of  3 and  significance  weight  of  150, 
an  isolated  value  of  intensity  5 or  6 will  qualify  the  echo 
for  tracking.  In  comparison,  an  echo  intensity  level  3 must 
extend  over  9 grid  boxes  to  be  selected  for  tracking. 
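
The trade-off between intensity and size can be explored numerically. The
sketch below assumes one reading of the significance weight formula, namely
S.W. = 100 log10[(Zmax/Zref)^0.5 * SUM((Z/Zref)^(1/1.6))], with the sum taken
over the grid boxes of the echo that meet the intensity threshold. That
reading, the function name, and the use of reflectivities in mm⁶/m³ as input
(rather than intensity digits, whose conversion is not given here) are all
assumptions of this example.

```python
from math import log10

def significance_weight(z_values, z_ref=100.0):
    """Significance weight of one echo under the assumed formula.

    z_values: reflectivities (mm^6/m^3) of the grid boxes in the echo
    that meet the intensity threshold.
    """
    z_max = max(z_values)
    size_term = sum((z / z_ref) ** (1 / 1.6) for z in z_values)
    return 100.0 * log10((z_max / z_ref) ** 0.5 * size_term)
```

Under this reading, a single grid box at the normalization reflectivity has a
weight of zero, and the weight rises with either a larger maximum
reflectivity or a larger number of qualifying grid boxes.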


Figure 1.  Different configurations of zero-degree reflectivities
having an intensity threshold of 3 and an SRI significance weight
threshold of 150.
The CCM model uses only a reflectivity threshold, and is thus
less selective.


In  most  of  our  experiments  with  the  SRI  and  LLS  models,  we 
have  used  the  following  combinations  of  reflectivity  thresholds 
and  significance  weights:  (1,25),  (2,50),  (3,150)  and  (4,200). 

We also made several runs with these models using only a
reflectivity threshold to make the results more strictly comparable
with those from the CCM model. Surprisingly, results were very
similar.

Tables 3-5 list the results of our tests on zero degree
reflectivities over a predictand area of 16 nmi². Table 3,
relating to the probability of detection (POD), shows that the
cross-correlation (CCM) technique is superior to the others on this
score. The SRI method fares worst in general. The CCM technique
also has the highest critical success index (CSI) except when
30-minute forecasts are made from 10-minute data sequences
(Table 4). The SRI method scores somewhat higher for such
forecasts. The linear least-square extrapolation (LLS) technique
shows the worst CSI scores. The SRI method redeems itself by
its low false alarm ratio (FAR), particularly for the 30- and
60-minute forecasts from 10-minute data sequences (Table 5).
However, this advantage disappears when 30-minute data sequences
are used.


When a 64 nmi² predictand area is used, the CCM is best in POD
(Table 6). It also has the highest CSI scores for forecasts from
both 10-minute and 30-minute data sequences (Table 7). Its FAR
score lags behind that of the SRI model for forecasts made from
10-minute data input (Table 8). The LLS technique is again
generally outranked by the other techniques on all scores.


Results  are  somewhat  different  when  vertically-integrated  liquid- 
water  (VIL)  content  data  are  used  (Tables  9-11).  We  tested  only 
10-minute  input  data  sequences.  When  all  the  VIL  data  are  used 
in  the  computations,  the  CCM  model  is  still  the  best  performer. 
However,  when  only  high  VIL  values  are  considered,  the  LLS  model 
scores  are  the  highest. 


We conclude that if we are interested in 60-minute high resolution
forecasts of reflectivity patterns and have 10-minute data sequences,
it may be advantageous to use the SRI technique, especially if we
wish to keep the number of false alarms relatively low. However,
this would require a relatively large computer capability, and the
forecasts may have to be performed in a central computer facility.
If, on the other hand, we prefer a relatively simple technique
which may be implemented locally on a mini-computer, the CCM
method recommends itself, especially for low-resolution forecasts
of 30 minutes or less. For tracking VIL patterns, an LLS type
model, using only the more intense values, is recommended.
Table 3.  Values of the cumulative Probability of Detection for
forecasts made by the three models from all five test days,
verified over a 16 nmi² grid box area.

Zero Tilt Reflectivity Data

                   POD (10-min Input)     POD (30-min Input)
        Integer      Forecast Length        Forecast Length
Model   Threshold     10     30     60        30     60     90

CCM         1         13     62     49        62     53     --
            2                55     43        53     47     --
            3                51     37        53     42     --
            4                42     28        46     33     --

LLS         1         60     37     24        50     35     27
            2         60     42     26        40     27     19
            3         57     35     23        31     19     14
            4         54     32     19        30     16     11

SRI         1         75     42     28        36     19     12
            2         68     37     23        41     29     21
            3         63     32     20        31     18     12
            4         60     28     16        29     16     11

Table 4.  Values of the cumulative Critical Success Index for
forecasts made by the three models from all five test days,
verified over a 16 nmi² grid box area.

Zero Tilt Reflectivity Data

                   CSI (10-min Input)     CSI (30-min Input)
        Integer      Forecast Length        Forecast Length
Model   Threshold     10     30     60        30     60     90

CCM         1         62     42     28               31     --
            2         58     36     24               26     --
            3         55     32     20               23     --
            4         50     26     15               17     --

LLS         1         46     26     17        35     23     17
            2         47     31     18        28     18     12
            3         44     25     16        21     12      8
            4         41     22     12        20     10      6

SRI         1         62     42     27               11      8
            2         57     37     23                      15
            3         53     32     20                       8
            4         49     28     16               11
Table 5.  Values of the cumulative False Alarm Ratio for forecasts
made by the three models from all five test days, verified over a
16 nmi² grid box area.

Zero Tilt Reflectivity Data

                   FAR (10-min Input)     FAR (30-min Input)
        Integer      Forecast Length        Forecast Length
Model   Threshold     10     30     60        30     60     90

CCM         1         24     44     60        45     58     --
            2         27     50     65        53     63     --
            3         30     54     69        53     67     --
            4         33     60     76        59     73     --

LLS         1         34     52     63        45     49     67
            2         31     48     61        52     65     75
            3         33     54     66        60     75     82
            4         36     59     72        63     81     89

SRI         1         21     34     48        57     71     77
            2         21     38     52        47     57     64
            3         24     43     57        56     71     78
            4         27     49     66        58     74     82

Table 6.  Values of the cumulative Probability of Detection for
forecasts made by the three models from all five test days, verified
over a 64 nmi² grid box area.

Zero Tilt Reflectivity Data

                   POD (10-min Input)     POD (30-min Input)
        Integer      Forecast Length        Forecast Length
Model   Threshold     10     30     60        30     60     90

CCM         1         83     69     57        69     60     --
            2         80     65     52        64     55     --
            3         77     61     47        61     50     --
            4         75     53     38        59     43     --

LLS         1         80     53     36        59     44     34
            2         70     49     36        48     35     25
            3         64     43     30        39     27     20
            4         60     40     27        39     24     17

SRI         1         75     49     34        45     26     12
            2         68     43     29        47     34     26
            3         61     37     25        37     40     17
            4         58     34     20        35     22     15


Table 7.  Values of the cumulative Critical Success Index for
forecasts made by the three models from all five test days, verified
over a 64 nmi² grid box area.

Zero Tilt Reflectivity Data

                   CSI (10-min Input)     CSI (30-min Input)
        Integer      Forecast Length        Forecast Length
Model   Threshold     10     30     60        30     60     90

CCM         1                50     35        49     37     --
            2                46     31        44     33     --
            3                41     27        40     29     --
            4                36     22        39     24     --

LLS         1         62     37     25        44     30     23
            2         59     42     26        40     24     17
            3         52     35     22        31     18     13
            4         49     32     19        30     15     10

SRI         1         69     49     34        32     18     12
            2         62     43     28        36     26     26
            3         56     37     25        28     18     17
            4         53     34     20        27     16     15

Table 8.  Values of the cumulative False Alarm Ratio for forecasts
made by the three models from all five test days, verified over a
64 nmi² grid box area.

Zero Tilt Reflectivity Data

                   FAR (10-min Input)     FAR (30-min Input)
        Integer      Forecast Length        Forecast Length
Model   Threshold     10     30     60        30     60     90

CCM         1         20     36     52        37     50     --
            2         27     40     56        41     55     --
            3         30     44     60        45     59     --
            4         33     49     66        47     64     --

LLS         1         22     40     54        36     50     58
            2         20     38     51        41     55     65
            3         22     41     55        47     64     73
            4         24     45     60        50     71     81

SRI         1         16     29     41        47     63     77
            2         16     30     43        39     49     55
            3         17     34     46        46     61     69
            4         19     38     55        45     62     72



Table 11.  Values of the cumulative False Alarm Ratio for
forecasts made by the three models from all five test days,
verified over a 16 nmi² grid box area.

VIL Data

                   FAR (10-min Input)
        Integer      Forecast Length
Model   Threshold     10     30     60

CCM         1         27     49     65
            2         31     58     72
            3         33     61     76
            4         36     62     78
            5         37     63     77

LLS         1         73     69     83
            2         35     56     67
            3         34     57     67
            4         29     51     67
            5         30     54     74

SRI         1         49     71     80
            2         32     53     64
            3         34     57     71
            4         34     60     81
            5         40     71     88

2.5  Future  Plans 

We  are  concentrating  our  future  effort  on  two  main  tasks — determining 
the  relationship  between  severe  weather  events  and  parameters 
derived  from  digital  weather  radar,  and  developing  a short  range 
forecast  of  the  probability  of  echoes  of  a predetermined  intensity 
occurring  at  each  grid  box  over  the  radarscope  area. 

Archived  digital  radar  data  will  be  used  to  develop  cartesian 
maps  of  zero  tilt  reflectivity  and  VIL  data.  Cells  will  be  defined 
objectively  from  these  data.  Parameters  will  be  defined  for  each 
cell  and  these  parameters  will  be  related  statistically  to  the 
occurrence  of  severe  weather  in  the  cell.  An  operational  mode 
would  entail  defining  the  cells  objectively,  computing  the  necessary 
parameters  for  each  cell,  and  applying  the  correct  equation  to 
determine  the  probability  of  severe  weather  associated  with  that 
cell. 

Similarly, we shall use archived D/RADEX data to forecast the
probability of occurrence of radar echoes of predetermined intensities
using, as predictors, forecasts of echo movements by appropriate
tracking models, previous radar observations, and trends.

Other potential predictors to be considered are VIL data, echo
tops, tropopause penetration, and various synoptic meteorological
and satellite data. Eventual implementation will result in maps
of forecast probabilities of echoes of predetermined intensity
over the entire grid. These should prove applicable for Air
Traffic Control.
3.  SHORT  RANGE  (2-6  hr)  FORECASTS

As mentioned earlier, this task was initiated to satisfy the needs
of flow controllers who need forecasts of several hours over a
comparatively large area to enable them to divert air traffic threatened
by inclement weather to safe terminals.

In  our  interim  report  (Alaka,  et  al.,  1975)  we  discussed  a preliminary 
prediction  scheme  which  showed  a good  deal  of  promise.  We  have  since 
expanded and improved the developmental procedure. In effect, we have
(1) added new predictors derived from new data sources, (2) increased
the size of the developmental data sample, and (3) expanded the
geographical domain of the forecasts.

We are now transmitting to the field, three times daily,
computer-produced 2-6 hr probabilities of thunderstorms. The forecasts
span the period 1700-0300 GMT, which is the period of maximum diurnal
frequency of thunderstorms.

3.1  Development  of  prediction  equations 

Our  prediction  scheme  continues  to  be  based  on  a combination 
of  classical  statistical  (Klein,  1970)  and  model  output 
statistics  (MOS)  approaches  (Glahn  and  Lowry,  1972).  We  have 
derived  separate  prediction  equations  for  each  of  the  periods 
1700-2100,  2000-0000,  and  2300-0300  GMT.  The  developmental 
data  sample  was  derived  from  the  spring  season  (mid-March  to 
mid-June)  of  1974  and  1975.  Because  of  the  smallness  of  this 
sample,  we  used  the  generalized  operator  approach  over  the 
entire  area  delineated  in  Figure  2. 

3.1.1  Thunderstorm  predictands 

In  developing  our  statistical  sample,  we  have  determined 
thunderstorm  occurrence  from  manually  digitized  radar 
(MDR) data. Radar echo intensities and coverage are
digitized over square areas 40-45 nmi on a side in
accordance with a ten-digit code from 0 to 9 (Moore, Cummings,
and Smith, 1974). A thunderstorm is assumed to occur
within  an  MDR  square  whenever  the  MDR  code  for  that 
square  equals  or  exceeds  4 during  the  4-hr  forecast 
periods  mentioned  above.  This  assumption  is  reasonable 
in  view  of  the  results  obtained  by  Mogil  (1974)  and 
Reap  and  Foster  (1975).  Altogether,  there  are  571  MDR 
squares  over  the  predictand  area  depicted  in  Figure  2. 

The predictand is assigned a value of 1 for a thunderstorm
occurrence and 0 otherwise.
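
The predictand definition above can be illustrated with a short sketch. It assumes the MDR codes for one 4-hr forecast period are held in an array with one row per report time and one column per MDR square; the array layout and the function name are illustrative assumptions, not part of the original procedure.

```python
import numpy as np

THUNDERSTORM_MDR_CODE = 4  # threshold stated in the text


def thunderstorm_predictand(mdr_codes):
    """Return 1 for each MDR square whose code reached 4 or more at any
    report time in the forecast period, and 0 otherwise.

    mdr_codes: integer array of shape (n_report_times, n_squares),
    an assumed layout chosen for this illustration.
    """
    mdr_codes = np.asarray(mdr_codes)
    return (mdr_codes >= THUNDERSTORM_MDR_CODE).any(axis=0).astype(int)


# Four hypothetical report times for three MDR squares.
hourly_codes = np.array([[0, 2, 4],
                         [1, 3, 0],
                         [0, 5, 0],
                         [0, 1, 2]])
predictand = thunderstorm_predictand(hourly_codes)
```

Handling of missing MDR reports is not shown here; it would have to be decided before such a function could be used on real data.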

3.1.2  Thunderstorm  predictors 

Potential predictors were developed from four data
sources, namely, basic hourly surface meteorological
observations, forecasts of basic upper air variables
by the Limited-area Fine Mesh model (LFM) of NMC
(Howcroft and Desmarais, 1971), MDR reports, and the
climatological relative frequency of the predictand.

Fig. 2. The [illegible] thunderstorm predictands were defined within
that region of the U. S. enclosed by the heavy line. The individual
predictand boxes correspond to the MDR squares shown; predictors are
evaluated at the centers of these boxes.

The valid times of the various data sources relative
to the forecast period are shown in Figure 3. Thus
the MDR observations are 2 1/2 hr earlier than the
beginning of the forecast period, the observed surface
data are 5 and 2 hr earlier, and the LFM forecasts
are valid 1 hr afterwards. The potential predictors
are evaluated at the centers of the predictand boxes.
Therefore the surface observations and LFM forecasts
had to be interpolated to these points. In the case
of the surface observations, this was done by analyzing
the data in accordance with a successive approximation
scheme (Cressman, 1959). LFM forecasts were
evaluated by standard interpolation from the LFM
grid to the centers of MDR boxes.

Fig. 3. Types of input data and their valid times relative to the
valid period of the predictand for the 1800-GMT forecast cycle.
(Diagram labels: time (GMT); forecast transmitted; valid period of
predictand; observed surface data; LFM forecast; predictand relative
frequency.)
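
As a rough illustration of a successive approximation (successive correction) analysis in the spirit of Cressman (1959), the sketch below interpolates scattered station values to box centers. It is a simplified variant under stated assumptions: the first guess is the mean of the observations, the corrections are evaluated directly at the station sites rather than by interpolating back from a grid, and the scan radii are arbitrary. None of these details is taken from the report.

```python
import numpy as np


def cressman_weight(dist, radius):
    """Cressman weight (R^2 - d^2) / (R^2 + d^2) inside radius R, else 0."""
    w = (radius**2 - dist**2) / (radius**2 + dist**2)
    return np.where(dist < radius, w, 0.0)


def _correction(dist, radius, residual):
    """Weighted mean of the station residuals for every row of `dist`."""
    w = cressman_weight(dist, radius)
    wsum = w.sum(axis=1)
    num = w @ residual
    return np.divide(num, wsum, out=np.zeros_like(num), where=wsum > 0)


def successive_corrections(obs_xy, obs_val, target_xy, radii=(20.0, 5.0)):
    """Analyze station values to target points with shrinking scan radii.

    The radii and the mean first guess are assumptions for illustration.
    """
    obs_xy = np.asarray(obs_xy, dtype=float)
    obs_val = np.asarray(obs_val, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    d_target = np.linalg.norm(target_xy[:, None, :] - obs_xy[None, :, :], axis=2)
    d_obs = np.linalg.norm(obs_xy[:, None, :] - obs_xy[None, :, :], axis=2)
    at_target = np.full(len(target_xy), obs_val.mean())
    at_obs = np.full(len(obs_xy), obs_val.mean())
    for radius in radii:
        residual = obs_val - at_obs  # observation minus current analysis
        at_target = at_target + _correction(d_target, radius, residual)
        at_obs = at_obs + _correction(d_obs, radius, residual)
    return at_target


stations = [(0.0, 0.0), (10.0, 0.0)]      # hypothetical station positions
values = [1.0, 3.0]                       # hypothetical observed values
box_centers = [(0.0, 0.0), (5.0, 0.0)]    # hypothetical MDR box centers
analysis = successive_corrections(stations, values, box_centers)
```

With these inputs the analysis reproduces the station value at the coincident box center and gives the midpoint value halfway between the two stations.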


The potential predictors are listed in Table 12, which
also shows the data sources from which they were
obtained. The list is mostly self-explanatory, but
the following instability indices, well-known for
their pertinence to thunderstorm prediction, require
definition:


a.  The K-index (George, 1960)

$$K = (T + T_d)_{850} - (T - T_d)_{700} - T_{500} \qquad (5)$$

where T denotes the temperature, T_d the dewpoint, and the
suffixes indicate the level.

b.  The modified K-index, K', where

$$K' = \tfrac{1}{2}\,(T_{sfc} + T_{850} + T_{d,sfc} + T_{d,850}) - (T - T_d)_{700} - T_{500} \qquad (6)$$

c.  The Total Totals (TT) index (Miller, 1972), where

$$TT = (T + T_d)_{850} - 2\,T_{500} \qquad (7)$$

d.  The modified Total Totals (TT')

$$TT' = \tfrac{1}{2}\,(T_{sfc} + T_{850} + T_{d,sfc} + T_{d,850}) - 2\,T_{500} \qquad (8)$$

e.  The Showalter index (SI) (Showalter, 1953)

$$SI = T_{500} - T^{*} \qquad (9)$$

where T* is the predicted temperature of an
air parcel lifted dry adiabatically from 850 mb
to saturation, then moist adiabatically to 500 mb.

f.  The modified Showalter index

$$SI' = T_{500} - T^{*\prime} \qquad (10)$$

where T*' is the observed temperature of an
air parcel lifted dry adiabatically from the
surface to its saturation, then moist adiabatically
to 500 mb.
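
A minimal sketch of the K and Total Totals indices follows, with temperatures and dewpoints in degrees Celsius. The standard forms follow George (1960) and Miller (1972). The modified forms are taken here to average the surface and 850-mb values (a factor of one-half); that factor is an assumption about the printed equations, and the function names are illustrative only.

```python
def k_index(t850, td850, t700, td700, t500):
    """K = (T + Td)_850 - (T - Td)_700 - T_500."""
    return (t850 + td850) - (t700 - td700) - t500


def total_totals(t850, td850, t500):
    """TT = (T + Td)_850 - 2 T_500."""
    return (t850 + td850) - 2.0 * t500


def low_level_mean_sum(t_sfc, td_sfc, t850, td850):
    """Half the sum of surface and 850-mb temperature and dewpoint.

    The one-half factor is an assumption made for this sketch.
    """
    return 0.5 * (t_sfc + t850 + td_sfc + td850)


def modified_k_index(t_sfc, td_sfc, t850, td850, t700, td700, t500):
    """K' with the 850-mb term replaced by the surface/850-mb mean."""
    return low_level_mean_sum(t_sfc, td_sfc, t850, td850) - (t700 - td700) - t500


def modified_total_totals(t_sfc, td_sfc, t850, td850, t500):
    """TT' with the 850-mb term replaced by the surface/850-mb mean."""
    return low_level_mean_sum(t_sfc, td_sfc, t850, td850) - 2.0 * t500


# Hypothetical sounding values (deg C).
k = k_index(18.0, 12.0, 6.0, 0.0, -10.0)
tt = total_totals(18.0, 12.0, -10.0)
k_mod = modified_k_index(26.0, 18.0, 18.0, 12.0, 6.0, 0.0, -10.0)
tt_mod = modified_total_totals(26.0, 18.0, 18.0, 12.0, -10.0)
```

The Showalter indices are omitted because they require a parcel-lifting calculation that is beyond a short example.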


Table 12.  Potential thunderstorm predictor variables.  The data source(s)
for each variable is given in the right hand column.

      Variable                                            Data Source(s)

  1.  Sfc u-component                                     Sfc obs
  2.  Sfc v-component                                     Sfc obs
  3.  BL u-component                                      LFM
  4.  BL v-component                                      LFM
  5.  850-mb u-component                                  LFM
  6.  850-mb v-component                                  LFM
  7.  500-mb u-component                                  LFM
  8.  500-mb v-component                                  LFM
  9.  MSL pressure                                        Sfc obs
 10.  700-mb vertical velocity                            LFM
 11.  Sfc mixing ratio                                    Sfc obs
 12.  850-mb mixing ratio                                 LFM
 13.  700-mb mixing ratio                                 LFM
 14.  850- to 500-mb mean mixing ratio                    LFM
 15.  850- to 500-mb mean temp.-dew point                 LFM
 16.  Sfc equiv. pot. temp.                               Sfc obs
 17.  Sfc equiv. pot. temp. x horizontal gradient of
      sfc equiv. pot. temp.                               Sfc obs
 18.  850-mb equiv. pot. temp.                            LFM
 19.  700-mb equiv. pot. temp.                            LFM
 20.  850- to 700-mb mean equiv. pot. temp.               LFM
 21.  (Sfc minus 700 mb) equiv. pot. temp.                Sfc obs + LFM
 22.  (850 mb minus 700 mb) equiv. pot. temp.             LFM
 23.  Sfc equiv. pot. temp. advection                     Sfc obs
 24.  850-mb equiv. pot. temp. advection                  LFM
 25.  700-mb equiv. pot. temp. advection                  LFM
 26.  Sfc moisture divergence                             Sfc obs
 27.  BL moisture divergence                              Sfc obs + LFM
 28.  850-mb moisture divergence                          LFM
 29.  K index                                             LFM
 30.  Modified K index                                    Sfc obs + LFM
 31.  Total Totals index                                  LFM
 32.  Modified Total Totals index                         Sfc obs + LFM
 33.  Showalter index                                     LFM
 34.  Modified Showalter index                            Sfc obs + LFM
 35.  500-mb wind speed                                   LFM
 36.  Mag. of 850- to 500-mb wind shear                   LFM
 37.  Signed mag. of 850- to 500-mb wind shear            LFM
 38.  (500 mb minus 850 mb) wind direction                LFM
 39.  BL vorticity                                        LFM
 40.  500-mb vorticity                                    LFM
 41.  500-mb vorticity advection                          LFM
 42.  Three-hr MSL pressure change                        Sfc obs
 43.  MDR variables (see Fig. 5)                          MDR data
 44.  Predictand relative frequency                       Predictand data

Each of the variables listed in Table 12 was smoothed
to remove wavelengths of 4 grid lengths and less. A
five-point Hanning filter (Shuman, 1957) was used.

3.1.3  Optimum  predictor  locations 

Thunderstorms occur on a scale considerably smaller than
that which the regularly available conventional observations
and numerical forecasts can resolve. An exception
is the weather radar, which is capable of depicting
smaller scale features associated with these
phenomena. Unfortunately, current operational radar
information is largely qualitative, and the manual
digitization of radar reflectivities is an attempt to
quantize this information. The MDR data we are using
provide, in effect, the only predictor input which is
even remotely comparable in scale to the predictand.

Our task, then, is to predict the small-scale thunderstorms
from data which are essentially large (synoptic)
scale. It is a common experience of those familiar
with synoptic maps that the meteorological fields
depicted by these maps form recognizable patterns of
pressure, wind, temperature, and humidity or of other
parameters derived from them. It is also a common
experience that these patterns, singly or in combination,
are precursors or concomitants of different weather
phenomena. However, the salient characteristics
of these patterns do not necessarily coincide with
one another in time and space, nor do they usually
coincide with the location of the weather with which
they are associated. As an example, while it is known
that low pressure is associated with rainy weather, it
does not follow that the rainiest weather occurs where
the pressure is lowest. Thus, the most effective use of
the predictors in Table 12 is not compatible with the
assumption that the predictors are collocated with the
predictand. An effort must therefore be made to determine
the optimum location of the different potential predictors
relative to that of the predictand. This is especially
important in the present case since most of our predictors
are based on observations made several hours before
predictand time.

We have devised a scheme to determine the optimum position
of each potential predictor. We defined the best position
as that which yields the highest linear correlation with
the predictand from among 30 points surrounding the
predictand box. The best positions relative to a predictand box
of 10 potential predictors are illustrated in Figure 4.
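
The search for the best predictor position can be sketched as a scan over space-lagged correlations. The example below assumes the predictor and predictand are stored as arrays of shape (days, rows, columns) on the same grid, pools all days and interior boxes into one sample, and scans a square window of offsets. The square window (25 offsets for a shift of 2) is a simplification of the 30 candidate points mentioned above, and the use of the largest correlation magnitude is an interpretation; both are assumptions.

```python
import numpy as np


def best_offset(predictor, predictand, max_shift=2):
    """Return (dy, dx, r) for the offset with the largest |correlation|.

    The predictor value at box (j + dy, i + dx) is paired with the
    predictand at box (j, i), pooled over all days and interior boxes.
    """
    predictor = np.asarray(predictor, dtype=float)
    predictand = np.asarray(predictand, dtype=float)
    _, ny, nx = predictand.shape
    m = max_shift
    y = predictand[:, m:ny - m, m:nx - m].ravel()
    best = None
    for dy in range(-m, m + 1):
        for dx in range(-m, m + 1):
            x = predictor[:, m + dy:ny - m + dy, m + dx:nx - m + dx].ravel()
            r = np.corrcoef(x, y)[0, 1]
            if not np.isfinite(r):
                continue  # skip offsets where a field is constant
            if best is None or abs(r) > abs(best[2]):
                best = (dy, dx, r)
    return best


# Synthetic check: the predictand equals the predictor one row down and
# two columns to the left of each box.
rng = np.random.default_rng(0)
field = rng.normal(size=(40, 10, 10))
target = np.roll(field, shift=(-1, 2), axis=(1, 2))
dy, dx, r = best_offset(field, target)
```

In this synthetic case the scan recovers the imposed displacement with a correlation of one.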

Because  fields  of  radar  echoes  are  highly  discontinuous, 
we  decided  to  offer,  as  potential  predictors,  MDR  values 
for  a relatively  large  number  of  boxes  surrounding  the 
predictand  box.  To  determine  which  MDR  boxes  were  most 
useful,  we  ran  a special  screening  regression  in  which 
30 candidate boxes were offered. The first 7 boxes
selected  (Fig.  5)  were  chosen  as  the  potential  predictors 
to  be  used  in  combination  with  the  other  variables  in 
Table 12.

Figure 4. The P's denote grid point positions of different predictor
variables relative to the predictand box. The position of each
variable was determined from the field of space-lagged linear
correlation coefficients where the correlation is between that variable
and the predictand; the highest correlation specified the position
or offset of the variable relative to the predictand box (shaded).
The subscripts to the P's are in the order of decreasing magnitude
of the linear correlation coefficients. Each variable is identified
below; the trailing information in parentheses gives the sign
of the correlation coefficient and the data from which the variable
was computed:

P1  - Modified K index (+; Sfc obs + LFM)
P2  - Showalter index (-; LFM)
P3  - Modified Total Totals index (+; Sfc obs + LFM)
P4  - 850-mb mixing ratio (+; LFM)
P5  - 850- to 500-mb mean mixing ratio (+; LFM)
P6  - Sfc moisture divergence (-; Sfc data)
P7  - Equiv. pot. temp. x horizontal gradient of equiv. pot.
      temp. (+; Sfc data)
P8  - Sfc equiv. pot. temp. advection (+; Sfc data)
P9  - 500-mb v-component (+; LFM)
P10 - 500-mb speed (-; LFM)


3.1.4  Linearizing  predictor-predictand  relationships 

The screening regression technique, used to derive the
prediction equations, relates the predictand to a weighted
linear combination of the predictor variables. However,
in general the predictor-predictand relationships are
in fact non-linear, except within limited ranges or in
segments (Alaka, et al., 1973). To deal with this problem,
predictors are usually either truncated to exclude the
range where their relationships with predictand frequency
are highly non-linear, or their total range is divided
into short segments, with each segment being treated as
a binary predictor (Miller, 1964). The drawbacks of this
procedure are: (1) lack of a priori guidance in defining
the binary limits, (2) the total number of predictors can
become exceedingly large, and (3) there is no assurance
that the predictive information of the predictor over
its entire range is completely utilized.

Fig. 5. Potential predictors defined from MDR data. The subscripts
denote the order of selection of seven MDR predictor boxes
in a special screening regression run wherein 30 MDR boxes
surrounding the predictand box were offered. The predictor
boxes selected for inclusion into the 1800 GMT equation are
indicated by asterisks. The predictand box is indicated by
shading.

Our procedure to deal with the non-linearity between each
predictor and predictand involved computing the predictand
frequency for short intervals over the range of the potential
predictor values (Charba, 1977a). The predictand
frequency for each value of the predictor was subsequently
obtained by linear interpolation between computed values.

Among the predictors in Table 12, those derived from surface
observations and from LFM output were linearized in
this manner. A possible disadvantage of this technique
is the risk of "overfitting" the dependent sample, which is
especially dangerous when the sample is short.
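
The linearization described above can be sketched as follows. The example computes the relative frequency of the binary predictand within equal-width intervals of a predictor and then replaces each predictor value by a frequency obtained through linear interpolation. The equal-width intervals, the number of intervals, and the use of interval means as interpolation nodes are assumptions; the report does not specify these details.

```python
import numpy as np


def fit_frequency_curve(x, y, n_bins=10):
    """Predictand relative frequency within equal-width predictor intervals.

    Returns (centers, freqs), where centers are the mean predictor values
    of the non-empty intervals (an assumed choice of interpolation nodes).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    centers, freqs = [], []
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any():
            centers.append(x[in_bin].mean())
            freqs.append(y[in_bin].mean())
    return np.array(centers), np.array(freqs)


def linearize(x_new, centers, freqs):
    """Replace predictor values by interpolated predictand frequencies."""
    return np.interp(x_new, centers, freqs)


# Hypothetical dependent sample: predictor values and 0/1 predictand.
x_dep = [0, 0, 1, 1, 2, 2, 3, 3]
y_dep = [0, 0, 0, 1, 1, 1, 1, 1]
centers, freqs = fit_frequency_curve(x_dep, y_dep, n_bins=4)
transformed = linearize([0.5, 1.5], centers, freqs)
```

The transformed predictor is then offered to the regression in place of the raw value. Narrow intervals track the dependent sample closely, which is where the overfitting risk noted above arises.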

3.2  Thunderstorm  probabilities 

We developed two separate thunderstorm probability equations
for each predictand period. In the first, the "primary"
equation, all the variables in Table 12 were offered for
screening as potential predictors. Those selected for the
1800 GMT equation are shown in Table 13 in the order of their
selection. This table also shows the data sources for each
predictor and the cumulative reduction of variance with each
added predictor. It is noteworthy that the cumulative reduction
of variance increased appreciably out to 16 terms (see
Charba, 1977a). Also noteworthy is that all except three
of these predictors involve surface observations or MDR
data.

Table 13.  Predictor variables selected for inclusion in the 1800 GMT primary
thunderstorm regression equation. The variables are listed in the
order they were selected; the cumulative reduction of variance
with each additional term is given in the right hand column. The
data from which each variable was computed is shown in parentheses.

                                                          Cumulative Reduction
      Variable                                            of Variance (%)

  1.  Modified K index (Sfc obs + LFM)                    18.7
  2.  MDR1 (MDR)                                          20.9
  3.  Modified Total Totals index (Sfc obs + LFM)         22.3
  4.  MDR2 (MDR)                                          23.6
  5.  Predictand relative frequency (Predictand data)     24.2
  6.  500-mb wind speed (LFM)                             24.9
  7.  Sfc moisture divergence (Sfc obs)                   25.5
  8.  850-mb equiv. pot. temp. (LFM)                      25.9
  9.  Sfc equiv. pot. temp. advection (Sfc obs)           26.3
 10.  MDR5 (MDR)                                          26.6
 11.  MDR4 (MDR)                                          26.9
 12.  MDR3 (MDR)                                          27.2
 13.  Modified Showalter index (Sfc obs + LFM)            27.4
 14.  Sfc mixing ratio (Sfc obs)                          27.7
 15.  Sfc equiv. pot. temp. (Sfc obs)                     28.0
 16.  500-mb v-component (LFM)                            28.2
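
A forward screening regression of the kind used to build such an equation can be sketched as below: at each step the candidate predictor that most increases the reduction of variance is added, and the cumulative reduction of variance is recorded. The stopping rule (a maximum number of terms and a minimum gain) and the function names are assumptions made for illustration; the operational screening program may differ.

```python
import numpy as np


def reduction_of_variance(X, y):
    """Fraction of predictand variance explained by a least-squares fit."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid.var() / y.var()


def screening_regression(X, y, max_terms=16, min_gain=0.001):
    """Forward selection; returns chosen column indices and cumulative RV."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_pred = X.shape[1]
    chosen, history, current = [], [], 0.0
    while len(chosen) < min(max_terms, n_pred):
        best_j, best_rv = None, current
        for j in range(n_pred):
            if j in chosen:
                continue
            rv = reduction_of_variance(X[:, chosen + [j]], y)
            if rv > best_rv:
                best_j, best_rv = j, rv
        if best_j is None or best_rv - current < min_gain:
            break  # assumed stopping rule: gain too small to keep going
        chosen.append(best_j)
        history.append(best_rv)
        current = best_rv
    return chosen, history


# Synthetic sample: the predictand depends mainly on columns 2 and 0.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 5))
y_demo = 3.0 * X_demo[:, 2] + X_demo[:, 0] + 0.1 * rng.normal(size=500)
chosen, history = screening_regression(X_demo, y_demo)
```

The `history` list plays the role of the cumulative reduction of variance column in Table 13, expressed as a fraction rather than a percentage.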

The second probability equation, the "backup" equation,
was developed without the MDR predictors. The backup
equations are used operationally to produce forecasts
at grid points where MDR data are missing or unavailable.
Table 14 lists the predictors in the order of their
selection. Note the considerable rearrangement of the
non-MDR predictors and the selection of some predictors
in binary form. As in the primary equation, observed
surface data (including predictand frequency) have the
greatest input to the equation.

Table 14.  Same as Table 13 for the 1800 GMT "backup" thunderstorm regression
equation.

                                                          Cumulative Reduction
      Variable                                            of Variance (%)

  1.  Modified K index (Sfc obs + LFM)                    18.5
  2.  Sfc moisture divergence (Sfc obs)                   19.8
  3.  Predictand relative frequency (Predictand data)     20.6
  4.  Modified Total Totals index, binary (Sfc obs + LFM) 21.3
  5.  500-mb wind speed (LFM)                             22.0
  6.  Sfc equiv. pot. temp. advection (Sfc obs)           22.4
  7.  850- to 500-mb mean mixing ratio (LFM)              22.8
  8.  500-mb v-component, binary (LFM)                    23.1
  9.  Showalter index (LFM)                               23.4
 10.  850-mb mixing ratio (LFM)                           23.6
 11.  Sfc equiv. pot. temp. x horizontal gradient of
      equiv. pot. temp. (Sfc obs)                         23.8
 12.  Modified Total Totals index, binary (Sfc obs + LFM) 24.0
 13.  850-mb equiv. pot. temp. (LFM)                      24.1
 14.  Sfc mixing ratio, binary (Sfc obs)                  24.3
 15.  Modified Showalter index (Sfc obs + LFM)            24.5

Analogous probability equations were also developed for
1500 and 2100 GMT. Their total reduction of variance
and predictand frequency, together with those for 1800
GMT, are given in Table 15.


Table 15.  Reduction of variance and predictand frequency associated with the
primary thunderstorm equation for each of the three times.

 Predictor       Reduction of       Predictand
 Time (GMT)      Variance (%)       Frequency (%)

   1500             24.0                8.4
   1800             28.2               10.1
   2100             25.8                8.6

3.2.1  Forecast  verification 

We verified the 1800 GMT forecasts for 2000-0000 GMT
transmitted operationally during the spring season of 1976. The
1500 and 2100 GMT forecasts were not verified since there
was little reason to believe that results for these times
would be substantially different. We tested the forecasts
for their bias and for their skill relative to both
climatology and persistence.


We defined the bias as the sum of all forecast probabilities
divided by the sum of the observations. Unbiased forecasts
would therefore have a value of one. The score used
to evaluate the skill of the probabilities is based upon
a quantity which is one-half the score defined by Brier
(1950). This quantity, P, is

$$P = \frac{1}{N} \sum_{i=1}^{N} (F_i - O_i)^2 \qquad (10)$$

where F_i is the probability estimate and O_i is the observed
event for case i. If P_p and P_c are the P-values of
the probability and climatic frequency forecasts, respectively,
then the skill score, SS, is defined as:

$$SS = \frac{P_c - P_p}{P_c} \times 100 \qquad (11)$$

Thus, SS is the percentage of improvement in P of the
probabilities over that of climatic frequency forecasts.
A positive score would mean that the probabilities are
superior to climatology; a negative score would mean the
opposite.

The best climatological frequency available for this
verification was the predictand relative frequency for
individual grid boxes computed from the dependent sample.
Since this quantity was also selected as a predictor,
the skill score actually measures the forecasting skill
of the other predictors.
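
The bias, the quantity P, and the skill score defined above can be computed as in the following sketch. The function names and the small sample are hypothetical; the reference forecast is whatever climatic frequency is supplied for each case.

```python
import numpy as np


def forecast_bias(forecast_prob, observed):
    """Sum of forecast probabilities divided by the sum of observations."""
    return np.sum(forecast_prob) / np.sum(observed)


def half_brier(forecast_prob, observed):
    """P = (1/N) * sum (F_i - O_i)^2, one-half the Brier (1950) score."""
    f = np.asarray(forecast_prob, dtype=float)
    o = np.asarray(observed, dtype=float)
    return np.mean((f - o) ** 2)


def skill_score(p_forecast, p_reference):
    """Percentage improvement of P over the reference forecasts."""
    return (p_reference - p_forecast) / p_reference * 100.0


# Hypothetical four-case sample.
observed = np.array([1, 0, 0, 1])
forecast = np.array([0.8, 0.2, 0.1, 0.5])
climatology = np.full(4, 0.5)

bias = forecast_bias(forecast, observed)
p_p = half_brier(forecast, observed)
p_c = half_brier(climatology, observed)
ss = skill_score(p_p, p_c)
```

The same `skill_score` function applies to a persistence reference by passing the P-value of the persistence forecasts instead of that of climatology.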

The independent sample from which the verification
statistics were compiled ran from 31 March to 14 June
1976. With probabilities at all 571 forecast points
combined,  the  total  number  of  events  was  33,012.  It 
should  be  noted,  however,  that  these  events  were  not 
entirely  independent  since  an  appreciable  correlation 
between  these  events  exists  from  day  to  day  and  from 
one  grid  point  to  another. 

Results showed a forecast bias of 1.2, indicating a slight
overall tendency to overforecast thunderstorm occurrences.
The skill score showed that the operational forecasts were
22 percent better than climatological frequency forecasts.

A desirable characteristic of probability forecasts is
that they be reliable. That is, predicted probability
should be as close as possible to the observed relative
frequency (Sanders, 1967). A plot of the reliability of the
operational forecasts for 5 percent intervals is shown
in Figure 6. The plot shows that the forecast probabilities
closely agreed with the observed frequencies except in
the 85 to 100 percent range.
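
A reliability tabulation of this kind can be sketched as follows: forecasts are grouped into 5 percent probability intervals, and the observed relative frequency is computed within each interval. The function name, the output layout, and the small tolerance used to guard against floating-point rounding at interval edges are assumptions made for illustration.

```python
import numpy as np


def reliability_table(forecast_prob, observed, width=0.05):
    """Rows of (lower, upper, count, mean forecast, observed frequency)
    for each non-empty probability interval of the given width."""
    f = np.asarray(forecast_prob, dtype=float)
    o = np.asarray(observed, dtype=float)
    n_bins = int(round(1.0 / width))
    # The small tolerance keeps values on an interval edge in the upper bin.
    idx = np.floor(f / width + 1e-9).astype(int)
    idx = np.minimum(idx, n_bins - 1)  # a probability of 1.0 joins the top bin
    rows = []
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any():
            rows.append((b * width, (b + 1) * width, int(in_bin.sum()),
                         float(f[in_bin].mean()), float(o[in_bin].mean())))
    return rows


# Hypothetical forecasts and verifying observations.
probs = [0.12, 0.13, 0.12, 0.87]
events = [0, 1, 0, 1]
table = reliability_table(probs, events)
```

Perfectly reliable forecasts would show an observed frequency equal to the mean forecast probability in every row; intervals with few cases, such as the highest probabilities, give noisy estimates.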


Fig. 6. Reliability of the probability forecasts at 1800 GMT within 5%
intervals. Perfect reliability is indicated by the dashed line.

Since these probability forecasts project out only to
6 hr in the future, one might expect persistence to have
skill. Our tests showed that persistence fared very
poorly in this independent sample, the skill score being
inferior to climatology by about 15 percent. This poor
performance of persistence is probably due to the short
lifetimes of thunderstorms and the diurnal variation in
their frequency (Wallace, 1975).

Another way of gaining an insight into the performance
of the forecasts is through a series of case studies.
Out of seven days picked at random (except for the
requirement that thunderstorms must have occurred somewhere
in the forecast domain), we chose three cases for
verification. The main factor determining our choices was the
availability of good verifying MDR data in the areas where
the thunderstorms occurred. Thus, the forecasts on the
three days selected were not necessarily the best of the
original seven.

The probability forecasts and the corresponding
verifications for the three cases are shown in Figures 7, 8,
and 9. A thunderstorm (or predictand) occurrence was
defined in the same way it was in the developmental
sample. Therefore, where a T appears the observed
predictand was actually 1; where T's do not appear the
predictand was 0. Areas where the verifying MDR data
were not available are delineated by dotted lines and
should be ignored.

Figures 7, 8, and 9 illustrate that the general pattern
of thunderstorm occurrence agrees well with the envelope
of higher probabilities. On the other hand, the smaller-scale
probability maxima and minima do not verify
consistently, although they appear to be correct at least as
often as incorrect. Another apparent weakness exhibited
by these cases is that, when thunderstorms occurred in both
the Midwest and along the Gulf Coast, the probabilities
were higher in the latter region. This deficiency is
likely a consequence of deriving and subsequently applying
a single prediction equation for the entire forecasting
area. Predictor coefficients and predictor variables
appropriate for the Gulf Coastal states are not likely
to be the same as for the upper Midwestern states.
Therefore, a single equation applied to both regions is likely to
result in forecast errors in both regions. Investigation
of this problem is currently in progress.

3.3  Future  Plans 

Our  plans  call  for  additional  study  in  several  areas 
which  would  likely  result  in  improvements  to  the  current 
forecasts.  The  following  are  some  of  these  areas: 

a.  Better  use  could  be  made  of  the  MDR  data.  Work,  now 
in  progress,  shows  that  the  frequency  of  thunderstorm 
occurrence  (as  determined  from  MDR  data)  is  highly 
correlated  with  the  distances  of  the  MDR  boxes  from 
the  radar  stations.  This  distance  dependence  is  due 
to  poor  detection  of  precipitation  cells  by  radar  at 
large  ranges  from  the  station.  We  plan  to  incorporate 
procedures  which  would  properly  screen  poor  quality 
data  from  the  dependent  and  independent  samples. 

b.  There are differences in predictor/predictand
relationships from one geographical region of the grid to
another. These differences are due mainly to the
proximity of mountain ranges and large bodies of water,
and to latitude. Charba (1977b) has successfully accounted
for such differences in objective severe local storms
forecasting. However, the current thunderstorm
forecasting method does not account for them and this is
believed to be the cause of geographical biases in
the probabilities as noted in the discussion of the
case studies. We plan to incorporate techniques which
would alleviate this problem.

c.  Another area of investigation which should result in a
significant improvement concerns the development of a
better method of positioning predictors relative to the
predictand box. In the current procedure, predictors
are optimally positioned but only in a climatological
sense. It may be profitable to investigate techniques
that would position predictors differently for each day
according to the synoptic situation.

We expect the above studies to result in improvements
to the current development procedure. Of course,
as new data sources and longer data samples become
available in the future, these too will help improve
the prediction equations.

4.  SUMMARY  OF  ACHIEVEMENTS 

In  the  0-2  hr  prediction,  we  have: 

- Completed tests and comparisons of 10-, 30-, 60-, and 90-minute
forecasts of echo coverage by three different tracking models of
varying complexity, from 10- and 30-minute sequences of digitized
zero degree reflectivity data. Forecasts were verified over 16 and
64 nmi^2 areas.

- Tested and compared 10-, 30-, and 60-minute forecasts, by the same
models, of vertically integrated liquid water content (VIL) from
10-minute input data sequences. Forecasts were verified over a
16 nmi^2 area.

- Recommended  the  most  suitable  models  for  different  operational 
applications.  The  0-30  minute  forecast  utilizing  CCM  may  have 
the  accuracy  required  for  use  in  air  traffic  control. 

In  the  2-6  hr  prediction,  we  have: 

- Developed thunderstorm probability equations for the periods 1700-2100,
2000-0000,  and  2300-0300  GMT.  Separate  equations  were  derived  for 
the  spring  and  summer  seasons. 

- Implemented  the  probability  equations  on  an  operational  day-to-day 
basis  from  April  to  late  September  1976.  During  this  period,  the 
forecasts  were  transmitted  by  teletype  to  NWS  and  FAA  stations  by 
1540,  1840,  and  2140  GMT,  respectively. 

- Verified  the  operational  probability  forecasts  against  thunderstorm 
observations. 

5.  RECOMMENDATIONS 

The FAA should adopt the 0-30 minute forecast of thunderstorms for
operational tests. These operational tests would lead to procedures on
controller use of these forecasts as well as determine the forecast
accuracy required for air traffic control.